INTRA ESTIMATION CHROMA MODE 0 SUB-BLOCK DEPENDENT PREDICTION

Field of the Invention 

The present invention relates to digital video generally 
and, more particularly, to a method and/or circuit for implementing 
an intra estimation chroma mode 0 sub-block dependent prediction. 

Background of the Invention 

When a current block (or macroblock) is encoded/decoded 
in intra mode, a prediction block is formed based on adjacent 
samples from previously encoded/decoded and reconstructed blocks. 
The prediction block is subtracted from the current block prior to 
encoding. When the current block is decoded in intra mode, a
prediction block is formed based upon samples from previously 
decoded and reconstructed blocks. The prediction block is added to 
the current block following decoding. 

A prediction block for encoding and decoding chrominance 
is formed for each 8x8 chroma block. Intra Chroma Mode 0 DC 
Prediction defines 4x4 sub-blocks within each 8x8 block. The 
sub-blocks are evaluated based on the availability of reconstructed
samples adjacent to the 8x8 block. Conventional approaches
restrict the availability of some predictors when encoding and
decoding the 4x4 sub-blocks. The conventional approach selects
which predictor to apply on all sub-blocks based on the samples
available to the 8x8 block, even though some of the sub-block
predictors do not use all of the surrounding samples. Restricting 
the predictors available can yield a prediction block that is 
visually less accurate and less efficient. 

It would be desirable to have a solution that would allow 
selection of the best 4x4 sub-block predictors available for the
number of reconstructed samples available. 

Summary of the Invention 

The present invention concerns an apparatus comprising a 
first processing circuit and a second processing circuit. The
first processing circuit may be configured to generate a plurality
of reconstructed samples in response to one or more macroblocks of
an input signal. The second processing circuit may be configured
to determine a best intra prediction chroma mode 0 predictor for
each chroma sub-block of a current macroblock in response to
available reconstructed samples adjacent to the current macroblock. 

The objects, features and advantages of the present 
invention include providing an intra estimation chroma mode 0
sub-block dependent prediction scheme that may (i) improve compression
efficiency and/or quality; (ii) uniquely evaluate predictors for
each 4x4 sub-block of an 8x8 chroma block; (iii) apply the most
accurate predictor for each sub-block based upon available samples;
(iv) eliminate unnecessary restrictions based on sample
availability; (v) increase likelihood of a closer match between
predicted and actual blocks; (vi) increase potential for selecting
more accurate, efficient and/or higher quality estimation for
compression; (vii) be used with reconstructed data when generating
a true predictor; (viii) be used when macroblock-adaptive
field/frame coding of H.264 is used in conjunction with constrained
intra prediction; and/or (ix) enable neighbor availability for each
sub-block independently.

Brief Description of the Drawings

These and other objects, features and advantages of the 
present invention will be apparent from the following detailed 
description and the appended claims and drawings in which: 
FIG. 1 is a block diagram illustrating encoding and
decoding operations;

FIG. 2 is a block diagram illustrating partitions or
segments of pictures;

FIG. 3 is a diagram generally illustrating an example
intra prediction operation for chrominance data;

FIG. 4 is a diagram illustrating various intra prediction
modes available for chrominance data;

FIG. 5 is a block diagram illustrating various components
of a compressed video system;

FIG. 6 is a block diagram illustrating an encoder in
accordance with a preferred embodiment of the present invention;

FIG. 7 is a more detailed diagram of the encoder of FIG. 6;

FIG. 8 is a block diagram illustrating a decoder in
accordance with a preferred embodiment of the present invention;

FIG. 9 is a more detailed diagram of the decoder of FIG. 8;

FIG. 10 is a more detailed block diagram illustrating an
example control circuit of FIGS. 8 and 9; and

FIG. 11 is a flow diagram illustrating a predictor
selection process in accordance with a preferred embodiment of the
present invention.

Detailed Description of the Preferred Embodiments 

Referring to FIG. 1, a block diagram is shown
illustrating encoding and decoding operations. In general, a data
stream (e.g., a video stream) may comprise a series of source
pictures 70a-n. The source pictures may also be referred to as
images, frames, a group-of-pictures (GOP) or a sequence. The
pictures generally comprise contiguous rectangular arrays of pixels
(i.e., picture elements). Compression of digital video without
significant quality degradation is usually possible because video
sequences contain a high degree of: 1) spatial redundancy, due to
the correlation between neighboring pixels, 2) spectral redundancy,
due to correlation among the color components, 3) temporal
redundancy, due to correlation between video frames, and 4)
psycho-visual redundancy, due to properties of the human visual
system (HVS).

Video frames generally comprise three rectangular 
matrices of pixel data representing a luminance signal (e.g., luma 
Y) and two chrominance signals (e.g., chroma Cb and Cr) that 
correspond to a decomposed representation of the three primary 
colors (e.g., Red, Green and Blue) associated with each picture 
element. The most common format used in video compression
standards is eight bits and 4:2:0 sub-sampling (e.g., the two
chroma components are reduced to one-half the vertical and
horizontal resolution of the luma component). However, other
formats may be implemented to meet the design criteria of a 
particular application. 
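
As a rough illustration of the 4:2:0 format described above, the
following Python sketch computes the dimensions of the luma and
chroma sample arrays of a picture or block. The function name and
the rounding-up of odd dimensions are assumptions made for the
example and are not taken from the present description.

```python
def plane_sizes_420(width, height):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for 4:2:0 sampling."""
    # Each chroma plane (Cb, Cr) has half the horizontal and half the
    # vertical resolution of the luma plane. Rounding up for odd sizes
    # is an assumption of this sketch.
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    return (width, height), (chroma_w, chroma_h)


# A 16 x 16 luma area is accompanied by one 8 x 8 Cb and one 8 x 8 Cr area.
luma, chroma = plane_sizes_420(16, 16)
print(luma, chroma)
```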

Each picture may comprise a complete frame of video 
(e.g., a frame picture) or one of two interlaced fields from an 
interlaced source (e.g., a field picture). The field picture 
generally does not have any blank lines between the active lines of 
pixels. For example, if the field picture is viewed on a normal 
display, the field picture would appear short and fat. For 
interlaced sequences, the two fields may be encoded together as a 
frame picture. Alternatively, the two fields may be encoded
separately as two field pictures. Both frame pictures and field
pictures may be used together in a single interlaced sequence.
High detail and limited motion generally favor frame picture
encoding. In general, field pictures occur in pairs (e.g.,
top/bottom, odd/even, field1/field2). The output of a decoding
process for an interlaced sequence is generally a series of
reconstructed fields. For progressive scanned sequences, all
pictures in the sequence are frame pictures. The output of a
decoding process for a progressive sequence is generally a series
of reconstructed frames.

The source pictures 70a-n may be presented to an encoder 
72. The encoder 72 may be configured to generate a series of
encoded pictures 74a-n in response to the source pictures 70a-n,
respectively. For example, the encoder 72 may be configured to
generate the encoded pictures 74a-n using a compression standard
(e.g., MPEG-2, MPEG-4, H.264, etc.). In general, encoded pictures
may be classified as intra coded pictures (I), predicted pictures
(P) and bi-predictive pictures (B). Intra coded pictures are
generally coded without temporal prediction. Rather, intra coded
pictures use spatial prediction within the same picture. For
example, an intra coded picture is generally coded using
information within the corresponding source picture (e.g.,
compression using spatial redundancy). An intra coded picture is
generally used to provide a receiver with a starting point or 
reference for prediction. In one example, intra coded pictures may 
be used after a channel change and to recover from errors. 

Predicted pictures (e.g., P-pictures or P-frames) and
bi-predictive pictures (e.g., B-pictures or B-frames) may be referred
to as inter coded. Inter coding techniques are generally applied
for motion estimation and/or motion compensation (e.g., compression
using temporal redundancy). P-pictures and B-pictures may be coded
with forward prediction from references comprising previous I and
P pictures. For example, the B-picture 74b and the P-picture 74c
may be predicted using the I-picture 74a (e.g., as indicated by the
arrows 76 and 78, respectively). The B-pictures may also be coded
with (i) backward prediction from a next I or P-reference picture
(e.g., the arrow 80) or (ii) interpolated prediction from both past
and future I or P-references (e.g., the arrows 82a and 82b,
respectively). However, portions of P and B-pictures may also be
intra coded or skipped (e.g., not sent at all). When a portion of
a picture is skipped, the decoder generally uses the associated
reference picture to reconstruct the skipped portion with no error.

However, the concept of what particular pictures may
reference what other particular pictures may be generalized in a 
particular compression standard (e.g., H.264). For example, P- 
pictures may reference temporally forward or backward. B-pictures 
may have similar forward or backward references. The restriction 
is generally not time, but rather how many frames are stored in a 
buffer so that the frames may be decoded in a different order than 
the frames are displayed. In one example, the frames may be 
referenced forward in time. In another example, the frames may be 
referenced backward in time (e.g., re-ordering the frames). 

In one example, a B-frame may differ from a P-frame in 
that a B-frame may do interpolated prediction from any two 
reference frames. Both reference frames may be (i) forward in 
time, (ii) backward in time, or (iii) one in each direction. B- 
pictures can be, and are expected to often be, used as prediction 
references in H.264. In many cases an important distinction is 
between reference and non-reference frames. 

The encoded pictures 74a-n may be presented to a decoder 
84. The decoder 84 is generally configured to generate a series of 
reconstructed pictures corresponding to the source pictures 70a-70n 
(e.g., images, frames, fields, etc.) in response to the encoded
pictures. In one example, the decoder 84 may be implemented within
the encoder 72 and the reconstructed pictures may be used in the 
prediction operations of the encoding process. 

Referring to FIG. 2, a block diagram is shown generally 
illustrating partitions or segments of pictures. In general, a 
picture (e.g., an image, a frame, a field, etc.) 70i may be divided
(e.g., segmented, partitioned, etc.) into a number of macroblocks
86. The macroblocks generally comprise an array of pixels having
vertical and horizontal dimensions of equal size (e.g., 32 x 32, 16
x 16, etc.). The macroblocks generally comprise luminance data
(e.g., luma Y) and chrominance data (e.g., blue chroma Cb and red
chroma Cr). In one example, the luminance data may have a
resolution that is twice that of the chrominance data (e.g., a
4:2:0 format).

The macroblocks 86 may be grouped in a number of slices 
90. The slices 90 may comprise an arbitrary number of macroblocks 
86. The slices 90 generally run from left to right and may
comprise an entire row of the picture 70i. However, a slice 90 may
comprise less than or more than an entire row of macroblocks 86
(e.g., H.264 compliant). In one example, a slice 90 may be defined
as a particular number of macroblocks 86 grouped together. For
broadcast profiles, the macroblocks 86 in a slice 90 are generally
consecutive macroblocks in raster scan order. However, for
streaming and/or video-conferencing applications, a map may be sent
identifying which scattered macroblocks are grouped together in a
slice. A compression standard (e.g., H.264) may also provide an
option of using macroblocks or macroblock pairs. A macroblock pair 
comprises two macroblocks located one above the other. When 
macroblock pairs are used, a slice or row generally comprises 
macroblock pairs rather than macroblocks. 

In one example, the macroblock 86 may be implemented as
a 16 x 16 block. The macroblock 86 may be encoded in an inter
prediction mode (e.g., compression based upon temporal redundancy)
or an intra prediction mode (e.g., compression based upon spatial
redundancy). In the inter prediction mode, each 16 x 16 macroblock
86 may be predicted with a single 16 x 16 vector (e.g., mode 1).
Alternatively, the macroblock 86 may be segmented into two 16 x 8
blocks (e.g., mode 2) or two 8 x 16 blocks (e.g., mode 3), in which
case two motion vectors may be generated for predicting the
macroblock 86. The macroblock 86 may also be segmented into four
8 x 8 blocks (e.g., mode 4), in which case four motion vectors may
be generated for the macroblock 86. When the macroblock 86 is
segmented into the four 8x8 blocks (e.g., mode 4), each 8x8
block may be optionally further segmented into two 4x8 sub-blocks
(e.g., mode 5), two 8x4 sub-blocks (e.g., mode 6) or four 4x4
sub-blocks (e.g., mode 7). An encoder generally decides which
"mode" to use for encoding each macroblock 86. For example, an
error score may be computed based on a closeness of match
determination for each mode, with the modes that use more vectors
being penalized (e.g., by increasing the respective error score)
because of the additional bits that it will take to encode the
motion vectors.
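
The penalized mode decision described above may be sketched as
follows. This is a minimal illustration only: the error values, the
per-vector penalty and the function name are hypothetical, and the
sub-partition modes 5 through 7 are left out for brevity.

```python
# Number of motion vectors per macroblock for partition modes 1-4
# as described above (modes 5-7 subdivide individual 8 x 8 blocks
# and are not modeled in this sketch).
VECTORS_PER_MODE = {1: 1, 2: 2, 3: 2, 4: 4}


def choose_mode(match_error, penalty_per_vector):
    """Pick the mode with the lowest penalized error score.

    match_error: dict mapping mode -> closeness-of-match error
        (hypothetical values, e.g. a sum of absolute differences).
    penalty_per_vector: hypothetical cost added per motion vector.
    """
    scores = {
        mode: err + penalty_per_vector * VECTORS_PER_MODE[mode]
        for mode, err in match_error.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


errors = {1: 900, 2: 850, 3: 870, 4: 800}
best, scores = choose_mode(errors, penalty_per_vector=40)
print(best, scores)
```

With no penalty the four-vector mode would win on raw match error
alone; the penalty shifts the choice toward a mode with fewer
vectors.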

When a block or macroblock is to be encoded in the intra 
prediction mode, a prediction block is generally formed based upon 
previously decoded and reconstructed blocks. In an encoder, the 
prediction block is generally subtracted from the current block
prior to encoding. In a decoder, the prediction block is generally
added to the current block prior to filtering. For luminance (or
luma) samples, the prediction block may be formed for either each
4x4 sub-block in the macroblock or for the entire 16 x 16
macroblock. When each 4x4 luma block is to be predicted, any
available one of nine prediction modes may be used for each 4x4
luma block. When the entire macroblock (e.g., a 16 x 16 luma
block) is to be encoded, any of four available prediction modes may
be used. 

For chrominance (or chroma) samples, the prediction block 
may be formed for either each 4x4 chroma sub-block in an 8 x 8
chroma block of the macroblock (e.g., for a macroblock in 4:2:0
format) or for the entire 8x8 chroma block. Both chroma Cb and
chroma Cr blocks are generally processed similarly. When each 4 x
4 chroma sub-block is to be predicted, one of four prediction modes
may be used for each 4x4 chroma sub-block (e.g., DC or mode 0,
horizontal or mode 1, vertical or mode 2, and plane or mode 3).

Referring to FIG. 3, a diagram illustrating an intra 
prediction operation for an 8 x 8 chroma block is shown. For each 
8x8 chroma block 91 to be predicted in a current (or source) 
slice 92, a top edge 93 and a left edge 94 are generally 
determined. The top edge 93 and the left edge 94 of the 8x8 
chroma block are used to determine whether reconstructed samples in 
a reconstructed slice 95 that are above and to the left of the 8 x 
8 chroma block (e.g., samples a-p) have been decoded and 
reconstructed (e.g., are available). If the samples a-p are 
available, four prediction blocks 96 (e.g., A-D) may be generated 
using sums (e.g., S0-S3) of the reconstructed samples a-p adjacent
to each 4x4 prediction block A-D. Each of the sums S0-S3
generally represents the sum of four of the reconstructed samples
bordering the chroma block at the location indicated in FIG. 3.
For example, S0=sum(a-d), S1=sum(e-h), S2=sum(i-l) and S3=sum(m-p).
The predictions for all chroma samples within a block (A, B, C or
D) are generally dependent upon the presence of the pixels used to
compute the sums S0-S3. However, one or more of the sums S0-S3 may
be unavailable by being outside the current picture or slice or
because constrained intra prediction is in use and the neighboring
blocks are not intra coded.
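
The sums S0-S3 may be illustrated with a short Python sketch. It
assumes that the samples a-h form the row above the 8 x 8 chroma
block and the samples i-p form the column to its left (FIG. 3 is not
reproduced here, so this mapping is an assumption). The function name
and the use of None to mark an unavailable edge are likewise choices
made for the example.

```python
def edge_sums(top, left):
    """Compute [S0, S1, S2, S3] for an 8 x 8 chroma block.

    top:  the 8 samples a-h above the block, or None if unavailable.
    left: the 8 samples i-p left of the block, or None if unavailable.
    A sum that cannot be formed is returned as None.
    """
    def halves(samples):
        if samples is None:
            return None, None
        return sum(samples[:4]), sum(samples[4:])

    s0, s1 = halves(top)    # S0 = a+b+c+d, S1 = e+f+g+h
    s2, s3 = halves(left)   # S2 = i+j+k+l, S3 = m+n+o+p
    return [s0, s1, s2, s3]


print(edge_sums([100] * 8, [60, 60, 60, 60, 80, 80, 80, 80]))
print(edge_sums(None, [60] * 8))
```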

Specifically, not all of the samples a-p may be available 
within the current reconstructed slice 95. In general, only 
previously encoded/decoded samples within a current reconstructed 
slice are considered available for intra prediction in order for
slices to be independently decoded. In addition, some modes of
operation (e.g., a constrained intra mode) may consider only
macroblocks coded in an intra prediction mode within a slice to be
available (e.g., for the constrained intra mode only other intra
macroblocks within the slice are considered available, inter coded
macroblocks are considered unavailable). Also, in H.264, a slice
may not always be independently decoded from other slices. For
example, a loop (or deblocking) filter may operate between slices.
However, the decoding process for pixels, up to but not including
the deblocking filter portion of the decoding process, may be
independently decoded in the various slices. In general, intra
prediction is performed on the decoded samples prior to the
deblocking filter process. 
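
The availability rules in the preceding paragraphs may be summarized
in a small Python sketch. The Macroblock record and its field names
are hypothetical and exist only for the example; an actual encoder or
decoder would track this state in its own structures.

```python
from dataclasses import dataclass


@dataclass
class Macroblock:
    slice_id: int            # slice the macroblock belongs to
    is_intra: bool           # True if coded in an intra prediction mode
    is_decoded: bool = True  # True once encoded/decoded and reconstructed


def neighbor_available(current, neighbor, constrained_intra=False):
    """Return True if `neighbor` may supply samples for intra prediction."""
    if neighbor is None:                        # outside the picture
        return False
    if not neighbor.is_decoded:                 # not yet reconstructed
        return False
    if neighbor.slice_id != current.slice_id:   # belongs to another slice
        return False
    if constrained_intra and not neighbor.is_intra:
        return False                            # inter coded neighbor excluded
    return True


cur = Macroblock(slice_id=3, is_intra=True)
inter_nb = Macroblock(slice_id=3, is_intra=False)
print(neighbor_available(cur, inter_nb))
print(neighbor_available(cur, inter_nb, constrained_intra=True))
```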

In general, adjacent (or neighboring) samples refers to 
reconstructed samples in a line directly above or to the left of 
the current block. For field coded pictures and frame coded
pictures the meaning of neighboring/adjacent is very simple:
vertically adjacent samples are in the line above in the picture
(which may be either a frame or a field), and horizontally adjacent
samples are in the line to the left in the picture (which may be
either a frame or a field). However, with macroblock adaptive
field/frame (MB-AFF) coded pictures (e.g., particularly when using
constrained intra prediction), the samples considered to be
adjacent for intra prediction depend on the mode of the current
macroblock. For example, when processing a frame macroblock, the
adjacent samples generally comprise samples that are adjacent to
the current block with the picture samples arranged as a frame.
When processing a field macroblock, the adjacent samples generally
comprise the samples that are adjacent to the current block with
the picture samples arranged as the same parity field as the
current macroblock. In general, the sums S0 and S1 are either (i)
both available or (ii) both unavailable in all picture types since 
the macroblock above the current block is either available or 
unavailable. However, in MB-AFF coded pictures, it is possible to 
have only one of the sums S2 and S3 available. 

For example, with MB-AFF coding and constrained intra 
prediction, if a left adjacent macroblock pair are coded with one 
FRAME macroblock intra predicted and the other FRAME macroblock not 
intra predicted (e.g., inter predicted), only one-half (either the
top 4 or the bottom 4) of the neighboring samples (and therefore
only one of the sums S2 or S3) may be available for prediction for
the FIELD macroblocks in the current macroblock pair. In another
example, when a field macroblock pair is to the left of a current
frame macroblock pair, if one of the left macroblocks is not
available (e.g., due to being non-intra predicted), BOTH of the
sums S2 and S3 are generally not available for both macroblocks in
the current macroblock pair. In general, the present invention
provides for separately determining the availability of the 
individual sums (e.g., S0-S3) that make up the adjacent samples. 
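
The following Python sketch illustrates how a DC predictor could be
chosen for each 4x4 sub-block once the availability of each sum is
known individually. The description up to this point does not give
the predictor formulas, so several points are assumptions of the
sketch: the rounded-average rule, the fallback value of 128 (mid-range
for eight-bit samples), the layout of the sub-blocks (A top-left, B
top-right, C bottom-left, D bottom-right) and the fallback order used
for the sub-blocks B and C.

```python
def dc_value(sums, default=128):
    """Rounded mean of the samples behind the available sums.

    Each sum covers four samples; None marks an unavailable sum.
    `default` is an assumed fallback when no sum is available.
    """
    usable = [s for s in sums if s is not None]
    if not usable:
        return default
    count = 4 * len(usable)
    return (sum(usable) + count // 2) // count


def predict_sub_blocks(s0, s1, s2, s3):
    """Return one DC value per sub-block, using only the sums available."""
    return {
        "A": dc_value([s0, s2]),
        "B": dc_value([s1]) if s1 is not None else dc_value([s2]),
        "C": dc_value([s3]) if s3 is not None else dc_value([s0]),
        "D": dc_value([s1, s3]),
    }


# MB-AFF style case: both top sums present, only S3 of the left sums.
print(predict_sub_blocks(400, 400, None, 320))
```

In this case the sub-blocks C and D still benefit from S3 even though
S2 is missing, which is the kind of per-sum decision the paragraph
above describes.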

Referring to FIG. 4, a diagram illustrating various intra
prediction modes for chroma blocks is shown. In general, each 8 x 
8 chroma component of a macroblock is predicted from reconstructed 
chroma samples above (e.g., H samples) and/or to the left (e.g., V 
samples) that have previously been encoded/decoded and 
reconstructed. Four prediction modes are generally available for 
intra prediction of chroma samples: a mode 0 or DC prediction, a 
mode 1 or horizontal prediction, a mode 2 or vertical prediction 
and a mode 3 or plane prediction. In general, when an 8 x 8 block 
is coded in intra mode, both chroma Cb and chroma Cr blocks are 
intra coded. 
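
As a minimal sketch of two of the four modes, the Python code below
builds 8 x 8 prediction blocks for horizontal and vertical
prediction. It assumes the usual meaning of these modes, namely that
horizontal prediction repeats the left (V) samples along each row and
vertical prediction repeats the above (H) samples down each column;
the description itself does not spell this out, and the function
names are chosen for the example.

```python
def predict_horizontal(v_samples):
    """Mode 1: each row repeats the reconstructed sample to its left."""
    return [[v] * 8 for v in v_samples]


def predict_vertical(h_samples):
    """Mode 2: each column repeats the reconstructed sample above it."""
    return [list(h_samples) for _ in range(8)]


h = [10, 20, 30, 40, 50, 60, 70, 80]   # samples above the block
v = [1, 2, 3, 4, 5, 6, 7, 8]           # samples left of the block
print(predict_vertical(h)[7])
print(predict_horizontal(v)[7])
```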

Referring to FIG. 5, a block diagram of a system 100 is 
shown. In general, a content provider 102 presents video image, 
audio or other data 104 to be compressed and transmitted to an 
input of an encoder 106. The compressed data 108 from the encoder 
106 may be presented to an encoder transport system 110. An output 
of the encoder transport system 110 generally presents a signal 112 
to a transmitter 114. The transmitter 114 transmits the compressed 
data via a transmission medium 116. The content provider 102 may 
comprise a video broadcast, DVD, or any other source of a video
data stream. The transmission medium 116 may comprise a broadcast,
cable, satellite, network, DVD, hard drive, or any other medium
implemented to carry, transfer, and/or store a compressed 
bitstream. 

On a receiving side of the system 100, a receiver 118 
generally receives the compressed data bitstream from the 
transmission medium 116. The receiver 118 presents a bitstream 120 
to a decoder transport system 122. The decoder transport system 
122 generally presents the bitstream via a link 124 to a decoder 
126. The decoder 126 generally decompresses the data bitstream and 
presents the data via a link 128 to an end user 130. The end user 
130 may comprise a television, monitor, computer, projector, hard 
drive, or any other medium implemented to carry, transfer, present, 
display and/or store an uncompressed bitstream. 

Referring to FIG. 6, a block diagram illustrating an 
encoder 106 in accordance with a preferred embodiment of the
present invention is shown. The encoder 106 may be implemented, in
one example, as an H.264 compliant encoder. The encoder 106
generally comprises a processing block 132 and a processing block
134. The encoder 106 may also comprise an encoding block 136. The
processing block 132 may be implemented as a general processing
block. The processing block 134 may be implemented as an intra
prediction chroma processing block. 

The general processing block 132 may have an input 140 
that may receive a signal (e.g., INPUT). The signal INPUT 
generally comprises an uncompressed digital video signal comprising 
a series of pictures (e.g., frames, fields, etc.). Each picture 
generally comprises a representation of a digital video signal at 
a particular time. The general processing block 132 may be 
configured to generate a plurality of macroblocks from each 
picture. The general processing block 132 may also have an output 
142 that may present one or more signals (e.g., CTR1) to an input
144 of the encoding circuit 136. 

The encoding circuit 136 may have an output 146 that may 
present a signal (e.g., OUTPUT). The signal OUTPUT may be a 
compressed and/or encoded bitstream, such as an H.264 compliant 
digital video bitstream. In one example, the encoding circuit 136 
may be configured to perform entropy coding. The circuit 136 may
be further configured to provide serialization (e.g., zig-zag scan) 
and re-ordering of the transformed and quantized pictures. 

The general processing circuit 132 may have an output 150 
that may present one or more signals (e.g., INT1) to an input 152
of the intra prediction chroma processing block 134. Similarly,
the intra prediction chroma processing block 134 may have an output
154 that may present a signal (e.g., INT2) to an input 156 of the
general processing block 132, an output 158 that may present a
signal (e.g., PRED) to an input 160 of the general processing block
132 and an input 162 that may receive the signal INPUT. The signal
INT1 may comprise, in one example, previously encoded/decoded and
reconstructed samples of the pictures in the signal INPUT. The
signal INT2 may comprise, in one example, mode information
regarding prediction samples generated by the block 134. The 
signal PRED generally comprises one or more prediction samples 
related to each picture. 

Referring to FIG. 7, a more detailed diagram of the 
encoder 106 of FIG. 6 is shown. The intra prediction processing 
block 134 generally comprises a block (or circuit) 164 and a block 
(or circuit) 166. The circuit 164 may be implemented, in one 
example, as a control circuit. The circuit 166 may be implemented 
as a picture element chroma processing block. The circuit 164 may 
be configured to determine available intra prediction modes in 
response to the signals INPUT and INT1. In particular, the circuit
164 may be configured to determine availability of reconstructed
samples for each 4x4 chroma block to be intra predicted. The
circuit 164 may be configured to generate a signal (e.g.,
PREDICTORS) in response to the signals INPUT and INT1. In one
example, the signal PREDICTORS may be implemented as one or more
individual control signals. Alternatively, the signal PREDICTORS
may be implemented as a multibit signal, where each bit may be used
as a control signal. In one example, the signal PREDICTORS may be
configured to indicate available parameters for intra prediction
chroma mode 0.
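
One way to picture the multibit form of the signal PREDICTORS is as a
four-bit word with one availability flag per sum. The bit assignment
below (bit n set when the sum Sn is available) and the function names
are hypothetical; the description does not define the encoding of
the signal.

```python
def pack_predictors(available):
    """Pack four availability flags (for S0..S3) into one multibit value.

    Hypothetical bit assignment: bit n is set when sum Sn is available.
    """
    word = 0
    for n, flag in enumerate(available):
        if flag:
            word |= 1 << n
    return word


def sum_available(word, n):
    """Read the control bit for the sum Sn back out of the packed value."""
    return bool(word & (1 << n))


word = pack_predictors([True, True, False, True])  # S2 unavailable
print(format(word, "04b"))
```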

The circuit 166 may be configured to generate prediction 
blocks for each 4x4 chroma block to be encoded. The circuit 166 
may be configured to receive the signals INPUT, INT1 and
PREDICTORS. The circuit 166 may be configured to generate the
signals INT2 and PRED in response to the signals INPUT, PREDICTORS
and INT1.

The circuit 132 generally comprises a block (or circuit)
170, a block (or circuit) 172, a block (or circuit) 173, a block
(or circuit) 174, a block (or circuit) 176, a block (or circuit)
177, a block (or circuit) 178, a block (or circuit) 180, a block
(or circuit) 182, a block (or circuit) 184, a block (or circuit)
186 and a block (or circuit) 188. The circuit 170 may be
implemented as an inter prediction processing circuit. The circuit
172 may be implemented as a motion estimation circuit. The circuit
173 may be implemented as a deblocking (or loop) filter. The
circuit 174 may be implemented as a picture memory circuit. The
circuit 176 may be implemented as a selection circuit, such as a
2:1 multiplexer. The circuit 177 may be implemented as a summing
circuit. The circuit 178 may be implemented as a transform
circuit. In one example, the circuit 178 may be configured to
perform a 4 x 4 integer transform or a discrete cosine transform
(DCT). The circuit 180 may be implemented as a control circuit.
The circuit 182 may be implemented as a quantization circuit. The
circuit 184 may be implemented as an inverse quantization circuit.
The circuit 186 may be implemented as an inverse transform circuit.
The circuit 188 may be implemented as a summing circuit.

An output of the quantization circuit 182, an output of
the motion estimation circuit 172, an output of the inter
processing circuit 170 and the signal INT2 may be presented as the
signal CTR1 at the output 142. The inverse quantization circuit
184 is generally configured to reverse the quantization process
performed by the quantization circuit 182. The inverse transform
circuit 186 is generally configured to reverse the transformation
process (e.g., DCT or 4 x 4 integer) performed by the circuit 178.
The inverse transform circuit 186 may also be referred to as an
inverse DCT block or an IDCT block.

The signal INPUT may be presented to the inter prediction 
processing block 170, the motion estimation block 172 and the 
summing block 177. The summing block 177 may mathematically 
combine the signal INPUT with either (i) the output of the inter 
prediction processing block 170 or (ii) the output of the block 
134. The selection may respond to a signal provided by the control 
circuit 180. The signal INPUT may be compressed with the transform 
circuit 178. The transform circuit 178 may translate the 
macroblocks in the signal INPUT from time domain frames to 
frequency domain frames. The quantization block 182 may reduce the 
number of bits in a number of coefficients representing the signal 
INPUT. The encoding block 136 may provide entropy coding (e.g., 
Huffman coding, binary arithmetic coding, context adaptive binary 
arithmetic coding or CABAC, etc.) to implement a lossless 
compression having frequent values represented in fewer bits. 

The inverse quantization circuit 184 and the inverse 
transform circuit 186 may be used to decode the encoded 
macroblocks. The summing block 188 may provide a mathematical
operation to sum the decoded macroblocks with the predicted
macroblocks to form reconstructed macroblocks. By reconstructing 
the macroblocks, the processing block 132 generally ensures that 
the prediction processing is based upon the same reference as would 
be available during decoding (e.g., reduces drift). 
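
The reason for reconstructing inside the encoder may be seen in a
small Python sketch. The transform stage is omitted, and the uniform
scalar quantizer, its step size and the function names are
assumptions of the example rather than details of the circuits
described above.

```python
def quantize(residual, step):
    # Python's round() is used for brevity; a real codec defines
    # its own rounding rule.
    return [round(r / step) for r in residual]


def dequantize(levels, step):
    return [q * step for q in levels]


def encode_block(current, prediction, step):
    """Return (levels sent to the decoder, encoder-side reconstruction)."""
    residual = [c - p for c, p in zip(current, prediction)]
    levels = quantize(residual, step)
    recon = [p + d for p, d in zip(prediction, dequantize(levels, step))]
    return levels, recon


def decode_block(levels, prediction, step):
    """Rebuild the block from the received levels and the same prediction."""
    return [p + d for p, d in zip(prediction, dequantize(levels, step))]


current = [104, 97, 101, 111]
prediction = [100, 100, 100, 100]
levels, enc_recon = encode_block(current, prediction, step=4)
dec_recon = decode_block(levels, prediction, step=4)
print(levels, enc_recon, enc_recon == dec_recon)
```

Because quantization is lossy, the reconstruction differs from the
source block, but the encoder and the decoder arrive at identical
reconstructed samples, so later predictions start from the same
reference on both sides.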

Referring to FIG. 8, a block diagram illustrating a 
decoder 126 in accordance with a preferred embodiment of the
present invention is shown. The decoder 126 may be implemented, in
one example, as an H.264 compliant decoder. The decoder 126
generally comprises a decoding block 190, a processing block 192 
and a processing block 194. The decoding block 190 may be 
implemented as an entropy decoding block. The decoding block 190 
may be further configured to re-order and deserialize information 
contained in the signal COMPRESSED. The processing block 192 may 
be implemented as a general processing block. The processing block 
194 may be implemented as an intra prediction chroma processing 
block. In one example, the block 194 may be implemented similarly 
to the block 134 of the encoder 106 (described above in connection 
with FIGS. 6 and 7).

The decoding block 190 may have an input 196 that may 
receive the signal COMPRESSED and an output 198 that may present a 

24 
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number of coefficients to (i) an input 200 of the circuit 192 and 
(ii) an input 202 of the circuit 194. The coefficients generally 
represent a digital video signal comprising a series of pictures 
(e.g., frames, fields, etc.). Each picture generally comprises a 
■representation of a digital video signal at a particular time. The 
general processing block 192 may be configured to generate a 
plurality of reconstructed macroblocks from each picture. The 
general processing block 192 may also have an output 204 that may 
present a signal (e.g., UNCOMPRESSED). The signal UNCOMPRESSED may
comprise a reconstructed digital video signal.

The general processing circuit 192 may have an output 206
that may present one or more signals (e.g., INT1) to an input 208
of the intra prediction chroma processing block 194. Similarly,
the intra prediction chroma processing block 194 may have an output
210 that may present a signal (e.g., PRED) to an input 212 of the
general processing block 192. The signal INT1 may comprise, in one
example, previously encoded/decoded and reconstructed samples of
the pictures reconstructed from the signal COMPRESSED. The signal
PRED generally comprises one or more prediction samples related to 
each picture.

Referring to FIG. 9, a more detailed diagram of the
decoder 126 of FIG. 8 is shown. The intra prediction chroma
processing block 194 generally comprises a block (or circuit) 214 
and a block (or circuit) 216. The circuit 214 may be implemented, 
in one example, as a control circuit. The circuit 216 may be 
implemented as a picture element chroma processing block. The 
circuit 214 may be configured, in one example, to determine 
available predictors for intra prediction chroma mode 0 in response 
to the signals INPUT and INT1. In particular, the circuit 214 may
be configured to determine availability of reconstructed samples
used to generate sums for each 4x4 chroma block intra prediction.
The circuit 214 may be configured to generate a signal (e.g.,
PREDICTORS) in response to the signals INPUT and INT1. In one
example, the signal PREDICTORS may be implemented as one or more 
individual control signals. Alternatively, the signal PREDICTORS 
may be implemented as a multibit signal, where each bit may be used 
as a control signal. In one example, the signal PREDICTORS may be 
configured to indicate available parameters for intra prediction 
chroma mode 0 as defined in the H.264 standard. 

The circuit 216 may be configured to generate prediction 
blocks for each 4x4 chroma block to be decoded. The circuit 216
may be configured to receive the signals INPUT, INT1 and
PREDICTORS. The circuit 216 may be configured to generate the
signal PRED in response to the signals INPUT, PREDICTORS and INT1.

The circuit 192 generally comprises a block (or circuit) 
220, a block (or circuit) 222, a block (or circuit) 224, a block 
(or circuit) 226, a block (or circuit) 228, a block (or circuit) 
230, and a block (or circuit) 232. The circuit 220 may be 
implemented as an inter prediction processing circuit. The circuit 
222 may be implemented as a filter circuit. In one example, the 
circuit 222 may be configured as a deblocking filter. The circuit 
224 may be implemented as a picture memory circuit. The circuit 
226 may be implemented as a selection circuit, such as a 2:1 
multiplexer. The circuit 228 may be implemented as an inverse 
quantization circuit. The circuit 230 may be implemented as an 
inverse transformation circuit. In one example, the circuit 230 
may be configured to perform an inverse 4x4 integer transform or 
an inverse discrete cosine transform (IDCT). The circuit 232 may be
implemented as a summing circuit. 

An output of the decoding circuit 190 may be presented 
to an input of the inverse quantization circuit 228. The inverse 
quantization circuit 228 is generally configured to reverse the
quantization process performed when the signal COMPRESSED was
encoded. An output of the circuit 228 may be presented to an input
of the inverse transform circuit 230. The inverse transform
circuit 230 is generally configured to reverse the transformation
process (e.g., DCT or 4x4 integer) performed when the signal
COMPRESSED was encoded. 

An output of the inverse transform circuit 230 may be 
presented to the summing circuit 232. The summing block 232 may be 
configured to mathematically combine the output of the inverse 
transform circuit 230 (e.g., decoded macroblocks) with predicted 
blocks from either (i) an output of the inter prediction processing 
block 220 or (ii) an output of the block 194. An output (e.g.,
decoded and reconstructed macroblocks) of the summing circuit 232 
is generally presented to the picture memory 224. The memory 224 
may present the reconstructed macroblocks (i) to the circuit 194 in
the signal INT1 and (ii) to the filter block 222. The filter 222
may be configured to present filtered reconstructed macroblocks as 
references to the inter prediction processing block 220.
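
As a non-limiting illustration, the operation of the summing circuit 232 may be sketched in software as follows. The sketch assumes 8-bit samples clipped to the range 0-255 and blocks stored as lists of rows. The function name reconstruct_block is hypothetical and is used for illustration only.

```python
def reconstruct_block(residual, prediction, max_value=255):
    """Add a decoded residual block to a prediction block, sample by sample.

    Both inputs are lists of rows with identical dimensions. Clipping to
    [0, max_value] is an assumption for 8-bit video; other bit depths
    would pass a different max_value.
    """
    if len(residual) != len(prediction):
        raise ValueError("blocks must have the same number of rows")
    reconstructed = []
    for res_row, pred_row in zip(residual, prediction):
        if len(res_row) != len(pred_row):
            raise ValueError("rows must have the same length")
        reconstructed.append(
            [min(max(r + p, 0), max_value) for r, p in zip(res_row, pred_row)]
        )
    return reconstructed
```

The same routine may serve either prediction source, since the selection between the inter prediction output and the intra prediction output is made before the sum.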

Referring to FIG. 10, a more detailed block diagram 
illustrating an example implementation of the control blocks 164 
and 214 of FIGS. 6 and 8 is shown. The control blocks 164 and 214
may comprise a block (or circuit) 240, a block (or circuit) 242 and
a block (or circuit) 244. The block 240 may be implemented as a 
block location detection circuit. The block 242 may be implemented 
as a picture memory access block. The block 244 may be implemented 
as a logic block. The block 240 may have an output that may 
present a signal (e.g., OFFSET) to an input of the block 242. The 
signal OFFSET may comprise, in one example, coordinates within the 
current slice of an upper left corner of a current 8x8 chroma 
block to be predicted. The circuit 240 may be configured to 
determine the position of the current chroma block to be 
encoded/decoded within the current slice. For example, the circuit 
240 may be configured to determine the X,Y coordinates of the upper 
left corner of the current 8x8 chroma block. 
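
In one example, the position determination may be sketched in software as follows. The sketch assumes 4:2:0 sampling (one 8x8 chroma block per macroblock), macroblocks numbered in raster order, and coordinates expressed relative to the picture rather than the slice. The function name chroma_block_origin and its parameters are hypothetical.

```python
def chroma_block_origin(mb_addr, pic_width_in_mbs, chroma_size=8):
    """Return the (x, y) chroma coordinates of a macroblock's upper left corner.

    mb_addr          -- macroblock index in raster order (assumption)
    pic_width_in_mbs -- number of macroblocks per picture row
    chroma_size      -- chroma block width and height, 8 for 4:2:0
    """
    if mb_addr < 0 or pic_width_in_mbs <= 0:
        raise ValueError("invalid macroblock address or picture width")
    mb_x = mb_addr % pic_width_in_mbs   # macroblock column
    mb_y = mb_addr // pic_width_in_mbs  # macroblock row
    return mb_x * chroma_size, mb_y * chroma_size
```

A slice-relative offset, as described for the signal OFFSET, could be obtained by subtracting the origin of the first macroblock of the slice.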

The circuit 242 may be configured to determine the 
availability of previously encoded and reconstructed samples for 
generating the sums S0-S3 for the prediction of the current block 
(e.g., as illustrated in FIG. 3) in response to the signal OFFSET. 
In general, the circuit 242 may be configured to examine the 
picture memory 174 or 224 for the availability of the reconstructed 
samples adjacent to the current chroma block (e.g., represented by 
the signal INT1). The circuit 242 may be configured to generate a
number of signals representing the available sums (e.g., S0, S1,
S2, and S3) in response to the signals OFFSET and INT1. In one
example, the signals S0-S3 may be configured to indicate the
corresponding predictor for each 4x4 sub-block of the current
8x8 chroma block. Alternatively, the circuit 242 may be configured
to retrieve the reconstructed samples from the picture memory 174 
or 224. In one example, the circuit 242 may be configured to 
provide the reconstructed samples to the circuit 246 for use in 
generating the prediction block. 
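
As a non-limiting illustration, the generation of the sums S0-S3 from the available neighboring samples may be sketched as follows. The sketch assumes the grouping used for H.264 chroma DC prediction: S0 and S1 cover the left and right halves of the top row P(x, -1), and S2 and S3 cover the upper and lower halves of the left column P(-1, y). That grouping, the use of None to mark unavailable samples, and the function name edge_sums are assumptions made for illustration.

```python
def edge_sums(top, left):
    """Return [S0, S1, S2, S3]; an entry is None when its samples are unavailable.

    top  -- the 8 samples P(x, -1), x = 0..7, or None if the edge is unavailable
    left -- the 8 samples P(-1, y), y = 0..7, or None if the edge is unavailable
    Individual entries may also be None to mark partially available edges.
    """
    def normalize(edge):
        if edge is None:
            return [None] * 8
        if len(edge) != 8:
            raise ValueError("each edge must hold 8 samples")
        return list(edge)

    def group_sum(samples):
        # A sum is only usable when all four contributing samples exist.
        if any(s is None for s in samples):
            return None
        return sum(samples)

    top, left = normalize(top), normalize(left)
    return [
        group_sum(top[0:4]),   # S0: top row, x = 0..3
        group_sum(top[4:8]),   # S1: top row, x = 4..7
        group_sum(left[0:4]),  # S2: left column, y = 0..3
        group_sum(left[4:8]),  # S3: left column, y = 4..7
    ]
```

Tracking availability per group of four samples, rather than per edge, is what allows each sub-block to be evaluated independently.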

The circuit 244 may be configured to generate the signal
PREDICTORS in response to the signals S0-S3 received from the
circuit 242. In one example, the circuit 244 may be implemented as
combinational logic (e.g., in an application specific integrated
circuit or ASIC) or as a sequence of computer executable
instructions (e.g., a software implementation). The circuit 244
may be configured to logically combine the signals S0-S3 to
generate the prediction for chroma mode 0 of each intra prediction 
4x4 chroma sub-block. 

Referring to FIG. 11, a flow diagram 300 illustrating an
example intra prediction chroma mode 0 operation in accordance with
a preferred embodiment of the present invention is shown. When
intra prediction chroma mode 0 is selected (e.g., the block 302),
the current slice is generally checked to determine whether
reconstructed samples from a macroblock (or macroblock pair)
adjacent to a top edge of the current chroma block (e.g., samples
P(x, -1), where x = 0-7) and samples for a macroblock (or
macroblock pair) adjacent to a left edge of the current chroma
block (e.g., samples P(-1, y), where y = 0-7) are available (e.g.,
the block 304). The sums S0-S3 are generated depending upon the
available samples (e.g., the block 306).

When none of the sums S0-S3 can be generated, the sub- 
blocks A-D are generally predicted using a predictor with a 
predetermined value (e.g., the blocks 308 and 310). In one 
example, the predetermined value may correspond to a median chroma 
value (e.g., 128). When one or more of the sums S0-S3 are 
available, the best predictor for each sub-block (e.g., A-D) for 
the sums available may be selected (e.g., the block 312).

The example predictors may be summarized as
in the following TABLE 1:

TABLE 1

Block   Pred0             Pred1        Pred2        Pred3
A       (S0 + S2 + 4)/8   (S0 + 2)/4   (S2 + 2)/4   128
B       (S1 + 2)/4        (S1 + 2)/4   (S2 + 2)/4   128
C       (S3 + 2)/4        (S0 + 2)/4   (S3 + 2)/4   128
D       (S1 + S3 + 4)/8   (S1 + 2)/4   (S3 + 2)/4   128

In general, block predictors for each sub-block A-D may be 
generated according to the left-most formula in the TABLE 1 for
which the corresponding sums S0-S3 are available. For example,
predictors in the column Pred0 are generally better than predictors
in the column Pred1, predictors in the column Pred1 are generally
better than predictors in the column Pred2 and the predictors in
the column Pred2 are generally better than the values in the column
Pred3.
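
As a non-limiting illustration, the left-most selection rule of TABLE 1 may be sketched in software as follows. The sketch assumes that the division in TABLE 1 is integer division, that unavailable sums are marked with None, and that the sub-blocks A, B, C and D occupy the top-left, top-right, bottom-left and bottom-right quadrants of the 8x8 chroma block, respectively. The names CANDIDATES, dc_predictor and predict_chroma_8x8 are hypothetical.

```python
DEFAULT_DC = 128  # predetermined value from the column Pred3 of TABLE 1

# For each sub-block, the sums S0-S3 (by index) needed by the formulas in
# the columns Pred0, Pred1 and Pred2 of TABLE 1, in left-to-right order.
CANDIDATES = {
    "A": [(0, 2), (0,), (2,)],
    "B": [(1,), (1,), (2,)],
    "C": [(3,), (0,), (3,)],
    "D": [(1, 3), (1,), (3,)],
}

def dc_predictor(block, sums):
    """Return the predictor for one 4x4 sub-block given sums = [S0, S1, S2, S3]."""
    for needed in CANDIDATES[block]:
        if all(sums[i] is not None for i in needed):
            count = 4 * len(needed)  # each sum covers four samples
            total = sum(sums[i] for i in needed)
            # Rounded average: (S + 2)/4 or (Sa + Sb + 4)/8.
            return (total + count // 2) // count
    return DEFAULT_DC  # no usable sum for this sub-block

def predict_chroma_8x8(sums):
    """Build the 8x8 prediction block as a list of 8 rows of 8 samples."""
    dc = {block: dc_predictor(block, sums) for block in "ABCD"}
    upper = [[dc["A"]] * 4 + [dc["B"]] * 4 for _ in range(4)]
    lower = [[dc["C"]] * 4 + [dc["D"]] * 4 for _ in range(4)]
    return upper + lower
```

For example, with S0 = 40, S1 = 48, S2 = 80 and S3 unavailable, the sketch yields 15 for the sub-block A, 12 for the sub-blocks B and D, and 10 for the sub-block C.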

The function performed by the flow diagram of FIG. 11 may
be implemented using a conventional general purpose digital
computer programmed according to the teachings of the present
specification, as will be apparent to those skilled in the relevant
art(s). Appropriate software coding can readily be prepared by
skilled programmers based on the teachings of the present
disclosure, as will also be apparent to those skilled in the
relevant art(s). 

The present invention may also be implemented by the 
preparation of application specific integrated circuits (ASICs),
field programmable gate arrays (FPGAs), or by interconnecting an
appropriate network of conventional component circuits, as is
described herein, modifications of which will be readily apparent
to those skilled in the art(s).

The present invention thus may also include a computer 
product which may be a storage medium including instructions which 
can be used to program a computer to perform a process in 
accordance with the present invention. The storage medium can 
include, but is not limited to, any type of disk including floppy 
disk, optical disk, CD-ROM, and magneto-optical disks, ROMs, RAMs, 
EPROMs, EEPROMs, Flash memory, magnetic or optical cards, or any 
type of media suitable for storing electronic instructions.

The present invention may improve compression efficiency 
and quality by uniquely evaluating predictors for each 4x4 chroma 
sub-block. In the conventional approach, the sample availability 
for the entire 8x8 chroma block was used to determine which 
predictors to apply to the 4x4 sub-blocks. However, sub-block
predictors do not generally depend on the same samples. Therefore,
some sub-blocks may benefit from a more accurate predictor, 
provided the appropriate reconstructed samples are available, even 
if the equivalently accurate predictor for another sub-block is 
invalid due to the corresponding samples being unavailable. 
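
As a non-limiting illustration, the difference between the two approaches may be sketched as follows. The model of the conventional approach is an assumption made for illustration: a single column of TABLE 1 is chosen for all four sub-blocks based on whether the whole top edge (S0 and S1) and the whole left edge (S2 and S3) are available. The names TABLE_1, evaluate, per_sub_block and whole_block are hypothetical.

```python
# Sums (by index into [S0, S1, S2, S3]) needed by the formulas in the
# columns Pred0, Pred1 and Pred2 of TABLE 1 for each sub-block.
TABLE_1 = {
    "A": [(0, 2), (0,), (2,)],
    "B": [(1,), (1,), (2,)],
    "C": [(3,), (0,), (3,)],
    "D": [(1, 3), (1,), (3,)],
}

def evaluate(needed, sums):
    """Rounded average over the samples covered by the needed sums."""
    count = 4 * len(needed)
    return (sum(sums[i] for i in needed) + count // 2) // count

def per_sub_block(sums):
    """Sub-block dependent selection: left-most usable formula per row."""
    result = {}
    for block, row in TABLE_1.items():
        usable = [f for f in row if all(sums[i] is not None for i in f)]
        result[block] = evaluate(usable[0], sums) if usable else 128
    return result

def whole_block(sums):
    """Assumed conventional model: one column for all four sub-blocks."""
    top_ok = sums[0] is not None and sums[1] is not None
    left_ok = sums[2] is not None and sums[3] is not None
    if top_ok and left_ok:
        column = 0
    elif top_ok:
        column = 1
    elif left_ok:
        column = 2
    else:
        return {block: 128 for block in TABLE_1}
    return {block: evaluate(row[column], sums) for block, row in TABLE_1.items()}
```

With S0 = 40, S1 = 48, S2 = 80 and S3 unavailable, per_sub_block predicts the sub-block A from both S0 and S2 (value 15), whereas whole_block treats the entire left edge as unavailable and predicts the sub-block A from S0 alone (value 10). The sub-blocks B, C and D receive the same values (12, 10 and 12) under both models.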

The present invention generally improves compression 
efficiency and quality by uniquely evaluating predictors for each 
sub-block. For example, by introducing a block dependency, the 
most accurate predictor may be applied wherever possible. 

The present invention generally provides an advantage of 
sub-block dependent intra prediction that eliminates unnecessary 
restrictions based on sample availability (e.g., as applied to the 
entire 8x8 block). Predictors are generally determined uniquely
for each 4x4 sub-block using as many available samples as 
possible. The present invention generally increases the likelihood 
of a closer match between a predicted block and the actual image. 
The present invention may increase the potential for selection of 
a more accurate, efficient, and higher quality estimation mode for 
compression. 

The present invention may be used with reconstructed data 
when generating a true predictor. For example, the present
invention may be used when the Macroblock-Adaptive Field/Frame
coding tool of the H.264 standard is used in conjunction with
Constrained Intra Prediction (e.g., defined within clause 8.3 of
the H.264 specification). When Constrained Intra Prediction is
used, pixels from inter predicted macroblocks are generally
considered not available for intra prediction of other macroblocks.
The lack of availability of pixels from inter predicted macroblocks
may lead to different availability of the left neighbors of the
chroma sub-blocks of a field-coded macroblock. The present
invention generally enables neighbor availability for each
sub-block independently, allowing more efficient prediction, and
ultimately improving the possible compression efficiency of an 
H.264 video encoder. 

While the invention has been particularly shown and
described with reference to the preferred embodiments thereof, it
will be understood by those skilled in the art that various changes
in form and details may be made without departing from the spirit
and scope of the invention.